ABSTRACT

The general purpose of this study was to determine
the effects of similarities between stems and keyed choices on test
difficulty. Unlike previous investigations of this undesirable
characteristic of some multiple-choice items, the present study
employed items that were unintentionally faulty and samples of
examinees who were highly experienced test-takers. The first phase of
the study indicated that the presence or absence of similarities did
not significantly affect examinees' scores. The second phase,
however, indicated that examinees easily may be trained to recognize
and use such similarities to advantage. (Author)

Use of Similarities Between Stems and Keyed Choices
in Multiple-Choice Items




A recent survey of the literature indicated that nine experts
suggest that item-writers should eliminate superficial similarities
between the stems and keyed choices in multiple-choice items
([illegible], 1970). The most common type of similarity is a result of
repeating some portion of the stem in the keyed choice but not in
the distracters. Such similarities, it has been reasoned, may help
uninformed examinees select keyed choices.

The effects of violating this item-writing principle and thus
creating "faulty" test items have been investigated by means of two
approaches. First, in studies of "test-wiseness," items have been
written about fictitious material so that there was no correct response
from a scholarly point of view. In each of the items, one choice that
was similar to its respective stem was included. Examinees in these
studies (e.g., Diamond and Evans, 1972; Slakter, 1970a, b) were awarded
a point for "test-wiseness" each time they selected a choice that was
similar to the stem. In general, it has been found that at least some
examinees tend to mark a greater number of such choices than would be
expected to mark them on the basis of random guessing alone. Specifically,
subjects with higher intelligence-test scores (Diamond and Evans, 1972)
[...]
number of such choices. Furthermore, Slakter (1970b) found that
examinees can be trained successfully to select these choices in
response to the special types of items used in these studies. The
artificial nature of the special items, however, makes it difficult to
generalize to examinee behavior in real test situations. Specifically,
the items used in the studies cited above differed from real test items
in two potentially important respects. First, similarities were
intentionally incorporated in the special items used in these studies
and thus may be different in nature and degree from those that
unintentionally occur in test items. Secondly, the items used in these
studies had no "correct" response since the items dealt with fictitious
material. Thus, partial knowledge could not be used by examinees when
responding to the items. In real test situations, examinees often are
able to use partial information (and, of course, adequate information)
to determine their responses. Thus, the studies cited above do not
indicate whether examinees rely upon superficial clues when they have
some knowledge of the points in question.

Using a different approach, McMorris et al. (1972), as part of a
larger study, incorporated similarities in seven items of reasonable
difficulty. When the original items and the modified items were
administered to different groups of examinees, a small but statistically
[...]
The presence or absence of similarities did not significantly affect
either reliability or validity.

This study was conducted in two phases. The methodology of both
phases was similar to that used by McMorris et al. It was different
from all the previously mentioned investigations, however, in two
important respects. First, the faulty test items employed in this
study were obtained from a published test. As noted above, in previous
studies, similarities were knowingly incorporated in items by the
investigators. Thus, a higher degree of generalization can be obtained
from this study regarding the effects of faults that unintentionally
occur in test items. Secondly, the examinees in this study apparently
were much more experienced as test-takers, and thus should exhibit a
higher degree of test-wiseness than the examinees in previous studies.

PHASE I 
Method

Subjects 

The sample consisted of 113 undergraduate and graduate students
enrolled in courses in educational research and measurement. Many of
them had had experience writing tests as well as extensive experience
taking them. They were tested, however, prior to specific discussions
in their research and measurement courses of topics relevant to the
current investigation.



Ten vocabulary items that violated the item-writing principle under
consideration were identified in a published reading test appropriate
for use with adults. Specifically, in four of the items the first
syllables of the stimulus words were identical with the first syllables
of keyed choices (e.g., "pre-"); in four of the items the last
syllables were identical (e.g., "-some"); and in two of the items the
last three letters were identical (i.e., "-ous"). Only one of the [?]0
distracters in these items, furthermore, contained such a similarity
with its respective stimulus word. This was true despite the fact that
it has been suggested that distracters in vocabulary items that have
superficial similarities with their respective stimulus words may be
effective in attracting examinees who do not know the meanings of the
words in question (e.g., Kelly, 1937; Pyrczak, 1971).
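
The kind of similarity described above can be made concrete with a short Python sketch. It is only an illustration under stated assumptions: a shared run of at least three letters at the beginning or end of two words stands in for "identical syllables," the function names are hypothetical, and the example item ("tiresome" with its choices) is invented for this sketch and is not taken from the published test.

```python
def shared_prefix_len(a, b):
    """Count how many leading letters two words have in common."""
    n = 0
    for x, y in zip(a.lower(), b.lower()):
        if x != y:
            break
        n += 1
    return n


def shared_suffix_len(a, b):
    """Count how many trailing letters two words have in common."""
    return shared_prefix_len(a[::-1], b[::-1])


def is_superficially_similar(stimulus, choice, min_len=3):
    """Assumed rule: at least min_len shared letters at the start or end."""
    return (shared_prefix_len(stimulus, choice) >= min_len
            or shared_suffix_len(stimulus, choice) >= min_len)


def similar_choices(stimulus, choices, min_len=3):
    """Return the indices of the choices that resemble the stimulus word."""
    return [i for i, c in enumerate(choices)
            if is_superficially_similar(stimulus, c, min_len)]


# Invented item: the keyed choice (index 0) shares "-some" with the stimulus.
stimulus = "tiresome"
choices = ["wearisome", "lively", "quiet", "generous"]
print(similar_choices(stimulus, choices))
```

An item would count as "faulty" in the sense used here when the returned list contains the index of the keyed choice and no distracter.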

A second ten-item form of the test was constructed by
changing the keyed choices in the ten original items so that the
superficial similarities between the stems and the keyed choices were
eliminated. In this process, care was taken to obtain keyed choices
that were thoroughly adequate and that were at about the same level
of readability as the original keyed choices. These considerations
precluded the use of one additional faulty item that had been identified
in the published test since an adequate substitute for the keyed choice
could not be written.
was designed to determine whether similar examinees can be trained to
take advantage of the fault under investigation.

Method 

Subjects 

The sample consisted of 76 undergraduate and graduate students
enrolled in courses in educational research and measurement.

Instrumentation 

Two forms of a vocabulary test were constructed. Both contained
the faulty vocabulary items identified in the published reading test.
The only difference between the two forms was in the directions. For
one form, the directions merely indicated that the examinees should
"...circle the letter of the choice that most nearly means the same
thing as the underlined word." For the other form the directions also
indicated that when in doubt about the meaning of a given word, it
would pay to select a choice with a similar beginning or ending as the
underlined word.

Procedures 

A random half of the subjects was administered the form of the
test with the standard directions. The other half was administered the
form with special "test-wiseness" directions.



While it would have been desirable to have used a larger number
of items, the advantage of using real items in which faults were
unintentionally [...]
number of items. It is interesting to note, furthermore, that the
number of items with this particular fault was greater in the present
study than in the studies cited previously.

Procedures

A random half of the subjects was administered the form consisting
of the ten faulty items as they originally appeared in the published
test. The other half was administered the form of the test consisting
of the same ten items in their fault-free form.

Results 

The mean and standard deviation for the form of the test consisting
of the original items were 7.66 and 1.50, respectively. On the
fault-free form they were 7.35 and 1.[?]1, respectively. The difference
between the means was not statistically significant at the .05 level.
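
As a rough check on this comparison, the following Python sketch computes a pooled-variance two-sample t statistic from summary statistics. Several inputs are assumptions rather than reported facts: the text does not name the test that was used, the group sizes of 57 and 56 are inferred only from "a random half" of 113 subjects, and the second standard deviation is not fully legible in this copy, so two candidate readings (1.51 and 1.81) are tried. The function name `pooled_t` is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Return (t, df) for a pooled-variance two-sample t test
    computed from group means, standard deviations, and sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df


# Means and the first SD are from the text; the group sizes (57, 56)
# and both candidate values of the second SD are assumptions.
for sd2 in (1.51, 1.81):
    t, df = pooled_t(7.66, 1.50, 57, 7.35, sd2, 56)
    print(f"assumed sd2={sd2}: t={t:.2f}, df={df}")
```

Under either assumed reading the statistic stays well below the two-tailed .05 critical value of roughly 1.98 for about 111 degrees of freedom, which agrees with the reported lack of significance.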

PHASE II 

The negative results of the first phase of this study suggest that
the presence or absence of similarities between the stems and keyed
choices in vocabulary items does not affect test difficulty even for
highly experienced examinees. The second phase of the investigation



The mean and standard deviation for the form with standard
directions were 7.53 and 1.[??], respectively; for the form with
"test-wiseness" directions, they were 5.53 and 2.02, respectively.
The difference between the means was statistically significant at the
.01 level.

Discussion

Phase I of this investigation indicates that the presence or
absence of superficial similarities between keyed choices and their
respective stems does not affect the difficulty of vocabulary tests
for examinees who have not had special training. This finding is
especially interesting in light of the fact that highly experienced
examinees were used in this investigation. Those attempting the faulty
items, furthermore, marked correct choices to enough items that it
theoretically was possible for them to develop a cue-using strategy of
marking choices with similarities when they did not know the meaning
of a given stimulus word. The second phase of this study, however,
indicates that examinees easily can be trained to take advantage of the
fault under investigation.

The implication of the first phase of this study for item writers
is that the item-writing principle under consideration may be of little
practical importance when writing vocabulary items for examinees who
have not had special training. Since the second phase of this study
indicates that examinees easily may be trained to take advantage of
violations of the "similarity principle," however, it seems desirable
for item-writers to avoid writing keyed choices that are physically
similar to their respective stems. As noted above, however, it is not
always possible to provide a thoroughly adequate keyed choice without
violating this principle. When this is the case with respect to a
given item, it seems desirable for item-writers to consider the
possibility of providing plausible distracters that also are similar to
the stimulus word.